Nuclear Weapon Tests - Data Visualization

The goal is to present nuclear weapon test data on a map. The data first required extensive cleaning, which is documented in NuclearWeaponTests_Cleaning.Rmd. In addition to the map, several plots show differences and similarities between the tests of different testing parties.

Import libraries and set constants

Import libraries

#install.packages("plotly")  # needed only if plotly has not been installed already
library(plotly)
library(ggplot2)
library(dplyr)

Set a constant for the number of rows shown in tables.

ROWS <- 30  # number of data rows to display

Reading in data

The data was downloaded from http://nuclearweaponarchive.org/Library/Catalog and saved as nuclearweapondata.txt. An updated version is available at https://www.batguano.com/nuclear/nuccatalog.html. The cleaned file, nuclearweapondata_cleaned.csv, is read in here.

df_data <- read.csv("nuclearweapondata_cleaned.csv")

The file contains the following columns:

  • Index
  • Date
  • Test Party
  • Test Party Description
  • Site
  • Site Description
  • Test Type
  • Test Type Description
  • Seismic Body
  • Yield
  • Latitude
  • Longitude
  • Purpose
  • Purpose Description
  • Device
  • Device Description
  • Name

A glimpse of the data to confirm that everything looks normal:

head(df_data, 10) 
##     X       Date TestParty TestParty_Descr Site
## 1   1 1945-07-16        US   United States  ANM
## 2   2 1945-08-05        US   United States  HRJ
## 3   3 1945-08-09        US   United States  NGJ
## 4   4 1946-06-30        US   United States  BKN
## 5   5 1946-07-24        US   United States  BKN
## 6   6 1948-04-14        US   United States  ENW
## 7   7 1948-04-30        US   United States  ENW
## 8   8 1948-05-14        US   United States  ENW
## 9   9 1949-08-29        CP    Soviet Union <NA>
## 10 10 1951-01-27        US   United States NTSF
##                          Site_Descr TestType TestType_Descr SeismBody
## 1            Alamogordo, New Mexico     TOWR          Tower        NA
## 2                  Hiroshima, Japan     AIRD        Airdrop        NA
## 3                   Nagasaki, Japan     AIRD        Airdrop        NA
## 4    Bikini Atoll, Marshall Islands     AIRD        Airdrop        NA
## 5    Bikini Atoll, Marshall Islands     UNDW     Underwater        NA
## 6  Enewetak Atoll, Marshall Islands     TOWR          Tower        NA
## 7  Enewetak Atoll, Marshall Islands     TOWR          Tower        NA
## 8  Enewetak Atoll, Marshall Islands     TOWR          Tower        NA
## 9                              <NA>     TOWR          Tower        NA
## 10 Nevada Test Site, Frenchman Flat     AIRD        Airdrop        NA
##    Yield      Lat       Lng Purpose   Purpose_Descr Device   Device_Descr
## 1     21 33.67500 -106.4750      WR Weapons related      P Primarly Pu239
## 2     15       NA        NA      **             War      U  Primarly U235
## 3     21       NA        NA      **             War      P Primarly Pu239
## 4     21 11.00000  165.0000      WE Weapons effects      P Primarly Pu239
## 5     21 11.04471  165.1054      WE Weapons effects      P Primarly Pu239
## 6     37 11.00000  162.0000      WR Weapons related   <NA>           <NA>
## 7     49 11.00000  162.0000      WR Weapons related   <NA>           <NA>
## 8     18 11.00000  162.0000      WR Weapons related      U  Primarly U235
## 9     NA 48.00000   76.0000    <NA>            <NA>      P Primarly Pu239
## 10     1 37.00000 -116.0000      WR Weapons related   <NA>           <NA>
##        Name
## 1   TRINITY
## 2  LITTLEBO
## 3   FAT MAN
## 4      ABLE
## 5     BAKER
## 6      XRAY
## 7      YOKE
## 8     ZEBRA
## 9     JOE 1
## 10     ABLE

Modify time for plots

Convert Date to POSIXct format and add a Year column.

df_data$Date <- as.POSIXct(df_data$Date)
df_data$Year <- format(df_data$Date,"%Y")
head(df_data$Year, 10) 
##  [1] "1945" "1945" "1945" "1946" "1946" "1948" "1948" "1948" "1949" "1951"
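
Because format() returns character strings, Year is stored as text. That suits a discrete bar-chart axis, but arithmetic such as grouping by decade needs integers. A minimal sketch of the conversion, using a few made-up dates rather than the real data (the names dates, years and decade are illustrative only):

```r
# Illustrative dates only; not read from the data file
dates <- as.POSIXct(c("1945-07-16", "1952-10-03", "1961-10-30"), tz = "UTC")

# format() gives character years; convert them to integers for arithmetic
years <- as.integer(format(dates, "%Y"))

# Integer years can be binned, for example into decades
decade <- (years %/% 10) * 10
print(decade)
```

Whether the Year column should stay as text or become an integer depends on how the axis is meant to behave in the later plots.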

Create Plots

Test Locations on Map

Create text to show when hovering on map.

df_data$hover <- with(df_data, paste("Test Site:", Site_Descr, "<br>", "Date:", Date, '<br>', 
                                     "Test party:", TestParty_Descr, "<br>", "Name:", Name, '<br>', 
                                     "Device:", Device_Descr, "<br>", "Yield:", Yield, "<br>",
                                     "Purpose:", Purpose_Descr, "<br>", "Type:", TestType_Descr
                                      ))
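
Some fields are missing in the data (for example Site_Descr and Yield in row 9 above), and paste() turns a missing value into the text "NA". If that is not wanted in the hover text, one possible approach is to substitute a placeholder first. The sketch below uses a hypothetical helper, label_or_unknown, and two made-up vectors; it is not part of the original analysis:

```r
# Hypothetical helper: show "unknown" instead of "NA" in hover text
label_or_unknown <- function(x) ifelse(is.na(x), "unknown", as.character(x))

# Illustrative values standing in for two df_data columns
site  <- c("Alamogordo, New Mexico", NA)
yield <- c(21, NA)

# Build the hover text with missing values replaced
hover <- paste("Test Site:", label_or_unknown(site), "<br>",
               "Yield:", label_or_unknown(yield))
print(hover)
```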

Print map:

g <- list( # define map elements  
  showland = TRUE,
  landcolor = "White",
  showocean = TRUE,
  oceancolor = "LightBlue",
  showlakes = TRUE,
  lakecolor = "LightBlue",
  showrivers = TRUE,
  rivercolor = "LightBlue",
  showcountries = TRUE,
  countrycolor = "DarkGray",
  resolution = 50,
  projection = list( type = 'natural earth')
)
# define data shown on map 
fig <- plot_geo(df_data, lat = ~Lat, lon = ~Lng)
fig <- fig %>% add_markers(text = df_data$hover, 
                           color = df_data$TestParty_Descr,
                           size = I(25))
# define layout 
fig <- fig %>% layout(
    title = 'Nuclear tests', 
    geo = g
  )
fig
## Warning: `arrange_()` is deprecated as of dplyr 0.7.0.
## Please use `arrange()` instead.
## See vignette('programming') for more help
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.
## Warning: Ignoring 117 observations

Note that most locations on the map are approximate.
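
The warning "Ignoring 117 observations" presumably refers to rows that cannot be drawn because a coordinate is missing, as with the Hiroshima and Nagasaki rows shown earlier. One way to check this is to count such rows directly. The sketch below uses a small made-up data frame, df_toy, in place of df_data:

```r
# Toy data frame standing in for df_data; values are illustrative
df_toy <- data.frame(
  Name = c("A", "B", "C", "D"),
  Lat  = c(33.675, NA, 11.0, NA),
  Lng  = c(-106.475, NA, 165.0, 76.0)
)

# A row cannot be placed on the map if either coordinate is missing
missing_coords <- is.na(df_toy$Lat) | is.na(df_toy$Lng)
n_missing <- sum(missing_coords)
print(n_missing)

# Keep only the rows that can be plotted
df_mappable <- df_toy[!missing_coords, ]
```

Running the same two lines on df_data would show whether the count matches the number in the warning.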

Tests by Testing Party

Count the number of tests by testing party for the plot.

totCounts <- data.frame(df_data %>% group_by(TestParty_Descr) %>% summarise(NoEvents = n()))
totCounts <- na.omit(totCounts) # remove rows with missing values (NA)
totCounts
##     TestParty_Descr NoEvents
## 1             China       46
## 2            France      200
## 3             India        1
## 4 Israel (Putative)        1
## 5      Soviet Union      587
## 6    United Kingdom       44
## 7     United States     1147

Create plot

fig <- plot_ly(totCounts, labels = ~TestParty_Descr, values = ~NoEvents, type = 'pie', 
               textinfo = "percent", hoverinfo = 'text', textposition = 'inside', 
               text = ~paste(TestParty_Descr, ': ', NoEvents))
fig <- fig %>% layout(title = 'Nuclear Tests by Testing Party')
fig

Tests by Year and Testing Party

Count the number of tests by year and testing party for the plot.

eventCounts <- data.frame(df_data %>% group_by(Year,TestParty_Descr) %>% summarise(NoEvents = n()))
## `summarise()` has grouped output by 'Year'. You can override using the `.groups` argument.
head(eventCounts, 10)
##    Year TestParty_Descr NoEvents
## 1  1945   United States        3
## 2  1946   United States        2
## 3  1948   United States        3
## 4  1949    Soviet Union        1
## 5  1951    Soviet Union        2
## 6  1951   United States       16
## 7  1952  United Kingdom        1
## 8  1952   United States       10
## 9  1953    Soviet Union        2
## 10 1953  United Kingdom        2
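
The message printed by summarise() above is informational: with more than one grouping variable, the result stays grouped by the remaining ones. In dplyr 1.0.0 and later, the .groups argument makes the choice explicit and silences the message. A minimal sketch on made-up data (df_toy and counts_toy are illustrative names):

```r
library(dplyr)

# Toy data standing in for df_data; values are illustrative
df_toy <- data.frame(
  Year = c("1951", "1951", "1951", "1952"),
  TestParty_Descr = c("United States", "United States",
                      "Soviet Union", "United Kingdom")
)

# .groups = "drop" returns an ungrouped result and suppresses the message
counts_toy <- df_toy %>%
  group_by(Year, TestParty_Descr) %>%
  summarise(NoEvents = n(), .groups = "drop")

print(counts_toy)
```

Since the results here are wrapped in data.frame() anyway, dropping the grouping should not change the plots.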

Create plot

p <- ggplot(eventCounts, aes(fill=TestParty_Descr, y=NoEvents, x=Year)) + 
     geom_bar(position="stack", stat="identity", width = 0.9) + 
     theme(legend.title = element_blank())
fig <- ggplotly(p)
fig <- fig %>% layout(title = 'Number of Nuclear Tests Yearly',
                      xaxis = list(tickangle = 90, title = ""),
                      yaxis = list(title = ""))
fig

Test Types by Testing Party

Count the number of tests by test type and testing party.

df_testTypes <- data.frame(df_data %>% group_by(TestType_Descr,TestParty_Descr) %>% summarise(noTestTypes = n()))
## `summarise()` has grouped output by 'TestType_Descr'. You can override using the `.groups` argument.
head(df_testTypes, 10)
##     TestType_Descr TestParty_Descr noTestTypes
## 1          Airdrop           China           7
## 2          Airdrop          France           3
## 3          Airdrop    Soviet Union           2
## 4          Airdrop  United Kingdom           8
## 5          Airdrop   United States          54
## 6  Artillery Shell   United States           1
## 7       Atmosphere           China          13
## 8       Atmosphere          France          20
## 9       Atmosphere    Soviet Union         112
## 10         Balloon          France          20

Create plot

p <- ggplot(df_testTypes, aes(fill=TestParty_Descr, y=noTestTypes, x=TestType_Descr)) + 
     geom_bar(position="stack", stat="identity", width = 0.9) +
     theme(legend.title = element_blank())
fig <- ggplotly(p)
fig <- fig %>% layout(title = 'Number of Tests by Type and Testing Party',
                      xaxis = list(tickangle = 45,title = ""),  
                      yaxis = list(title = ""))
fig

Test Purposes by Testing Party

Count the number of tests by purpose and testing party.

df_purposes <- data.frame(df_data %>% group_by(Purpose_Descr,TestParty_Descr) %>% summarise(noPurposes = n()))
## `summarise()` has grouped output by 'Purpose_Descr'. You can override using the `.groups` argument.
head(df_purposes, 10)
##             Purpose_Descr TestParty_Descr noPurposes
## 1  Peaceful (engineering)    Soviet Union         12
## 2  Peaceful (engineering)   United States         36
## 3                  Safety   United States         26
## 4       Seismic detection   United States          7
## 5                     War   United States          2
## 6         Weapons effects          France          2
## 7         Weapons effects   United States        101
## 8         Weapons related          France        159
## 9         Weapons related  United Kingdom          2
## 10        Weapons related   United States        956

Create plot

p <- ggplot(df_purposes, aes(fill=TestParty_Descr, y=noPurposes, x=Purpose_Descr)) + 
     geom_bar(position="stack", stat="identity", width = 0.9) +
     theme(legend.title = element_blank())
fig <- ggplotly(p)
fig <- fig %>% layout(title = 'Number of Tests by Purpose and Testing Party',
                      xaxis = list(tickangle = 45, title = ""),
                      yaxis = list(title = ""))
fig